Automatic Detection of Sociolinguistic Variation Using Forced Alignment

Author

  • George Bailey
Abstract

Forced alignment software is now widely used in contemporary sociolinguistics, and is quickly becoming a crucial methodological tool as an increasing number of studies begin to utilise ‘big data.’ This study investigates the possibility of taking forced alignment one step further towards the goal of complete automation; specifically, it expands the functionality of FAVE-align to fully automate the coding of three sociolinguistic variables in British English: (th)-fronting, (td)-deletion, and (h)-dropping. This involved the expansion of pronouncing dictionaries to reflect the surface output of these variable rules; FAVE then compares the fit of competing acoustic models with the speech signal to determine the surface variant. It does so with an impressive degree of accuracy, largely comparable to inter-transcriber agreement for all variables; however, the pattern of its mistakes, which are largely false positives, suggests a difficulty in identifying the voiceless segments of (td) and (th). Although it is reassuring that inter-transcriber agreement was also lowest for these tokens, it should be noted that FAVE’s accuracy decreases at faster speech rates, while no comparable effect is found for agreement among human transcribers.

This working paper is available in University of Pennsylvania Working Papers in Linguistics: http://repository.upenn.edu/pwpl/vol22/iss2/3
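The dictionary-expansion step described in the abstract could be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: it assumes ARPAbet phone labels of the kind used in CMU-style pronouncing dictionaries, the function names (`th_fronting`, `td_deletion`, `h_dropping`, `expand_entry`) are hypothetical, and the rule conditions (TH/DH realised as F/V, word-final T/D dropped after a consonant, word-initial HH dropped) are simplifying assumptions.

```python
# Hypothetical sketch: generate surface-variant pronunciations for one
# dictionary entry so that a forced aligner can choose between them.
# Phones are ARPAbet labels; vowels may carry a stress digit (e.g. "AE1").

VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}


def is_vowel(phone):
    """True if the phone is a vowel, ignoring any stress digit."""
    return phone.rstrip("012") in VOWELS


def th_fronting(phones):
    """Assumed rule: TH -> F and DH -> V. Returns None if not applicable."""
    if "TH" not in phones and "DH" not in phones:
        return None
    return [{"TH": "F", "DH": "V"}.get(p, p) for p in phones]


def td_deletion(phones):
    """Assumed rule: drop a word-final T or D that follows a consonant."""
    if len(phones) >= 2 and phones[-1] in ("T", "D") and not is_vowel(phones[-2]):
        return phones[:-1]
    return None


def h_dropping(phones):
    """Assumed rule: drop a word-initial HH."""
    if len(phones) >= 2 and phones[0] == "HH":
        return phones[1:]
    return None


def expand_entry(phones):
    """Return the canonical pronunciation plus every variant produced by
    applying the rules above, alone or in combination."""
    variants = [list(phones)]
    for rule in (th_fronting, td_deletion, h_dropping):
        # Apply each rule to every variant collected so far.
        for base in list(variants):
            new = rule(base)
            if new is not None and new not in variants:
                variants.append(new)
    return variants


if __name__ == "__main__":
    for word, phones in [("HAND", ["HH", "AE1", "N", "D"]),
                         ("THINK", ["TH", "IH1", "NG", "K"])]:
        for variant in expand_entry(phones):
            print(word, " ".join(variant))
```

Each variant would be written as an additional entry for the same word, leaving the aligner to select whichever pronunciation best fits the speech signal.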

Similar resources

Large-scale analysis of Spanish /s/-lenition using audiobooks

Given forced alignment and accurate automatic phonetic classification and measurement, audiobooks are an important potential source of large-scale evidence about phonetic variation. For example, the audiobook version of the novel La Casa de los Espiritus, read by two Chilean actors, presents 17 hours of audio containing nearly 68,000 /s/ segments, distributed in a natural way across a wide vari...

Stylistic variation and social networks

Although stylistic variation within social networks has been described in adults, this topic remains under-researched in children. One question that remains unanswered is the extent to which stylistic variation is the result of automatic alignment or of intentional, pragmatically motivated adjustment. We present an in-depth sociolinguistic case study of a 10-year-old boy, his family and four fr...

Automatic detection of "g-dropping" in American English using forced alignment

This study investigated the use of forced alignment for automatic detection of “g-dropping” in American English (e.g., walkin'). Two acoustic models were trained, one for -in' and the other for -ing. The models were added to the Penn Phonetics Lab Forced Aligner, and forced alignment will choose the more probable pronunciation from the two alternatives. The agreement rates between the forced al...

Comparison of forced-alignment speech recognition and humans for generating reference VAD

This present paper aims to answer the question whether forced-alignment speech recognition can be used as an alternative to humans in generating reference Voice Activity Detection (VAD) transcriptions. An investigation of the level of agreement between automatic/manual VAD transcriptions and the reference ones produced by a human expert was carried out. Thereafter, statistical analysis was empl...

Automatic Assessment and Error Detection of Shadowing Speech: Case of English Spoken by Japanese Learners

Shadowing is a task where the subject is required to repeat the presented speech as s/he hears it. Although shadowing is cognitively a challenging task, it is considered as an efficient way of language training since it includes processes of listening, speaking and comprehension simultaneously. Our previous study realized automatic assessment of shadowing speech using the average of Goodness of...

Publication date: 2016